5 research outputs found
KERT: Automatic Extraction and Ranking of Topical Keyphrases from Content-Representative Document Titles
We introduce KERT (Keyphrase Extraction and Ranking by Topic), a framework
for topical keyphrase generation and ranking. By shifting from the
unigram-centric traditional methods of unsupervised keyphrase extraction to a
phrase-centric approach, we are able to directly compare and rank phrases of
different lengths. We construct a topical keyphrase ranking function which
implements the four criteria that represent high quality topical keyphrases
(coverage, purity, phraseness, and completeness). The effectiveness of our
approach is demonstrated on two collections of content-representative titles in
the domains of Computer Science and Physics.Comment: 9 page
Eventcube: Multi-dimensional search and mining of structured and text data
ABSTRACT A large portion of real world data are either text data or structured (e.g., relational) data. Moreover, such data are often linked together (e.g., structured specification of products linking with the corresponding product descriptions and customer comments). Even for text data such as news data, typed entities can be extracted with entity extraction tools. The EventCube project constructs TextCube and TopicCube from interconnected structured and text data (or from text data via entity extraction and dimension building), and performs multidimensional search and analysis on such datasets, in an informative, powerful, and userfriendly manner. This proposed EventCube demo will show the power of the system not only on the originally designed ASRS (Aviation Safety Report System) data sets, but also on news datasets, collected from multiple news agencies. The system has high potential to be extended in many powerful ways and serve as a general platform for search, OLAP (online analytical processing) and data mining on integrated text and structured data. After the system demo in the conference (if accepted), the system will be put on the web for public access and evaluation